This lab is about Logistic Regression. Follow the instructions below. Think hard before you call the instructors!
Download the zip file and unzip it in a local folder. Set the Matlab path to include that folder.
1.A Generate a 2-class training set by using the following code
[Xtr, Ytr] = MixGauss([[-1;-1],[1;1]],[0.8,0.8],7);
[Xts, Yts] = MixGauss([[-1;-1],[1;1]],[0.8,0.8],100);
Ytr(Ytr==2) = -1;   % relabel class 2 as -1 so labels are in {-1, 1}, as in exercise 2.A
Yts(Yts==2) = -1;
1.C Check how the separating function changes with respect to lambda: try some regularization parameters with lambda in [0, 1]. To visualize the separating function (and thus get a more general view of which areas are associated with each class) you may use the routine separatingLinearLR (type "help separatingLinearLR" in the Matlab shell; if you still have doubts about how to use it, have a look at the code).
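A plausible usage, assuming a calling convention analogous to separatingKernLR in exercise 2.C (the exact signature may differ, so check "help separatingLinearLR"):
lambda = 0.1;                          % for example
c = linearLRTrain(Xtr, Ytr, lambda);
figure
separatingLinearLR(c, Xts, Yts);       % signature assumed; see "help separatingLinearLR"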
1.D Check how the confidence function changes with respect to lambda: try some regularization parameters in [0, 1] and plot the result by using the following code.
c = linearLRTrain(Xtr, Ytr, lambda);
confidenceLRLinear(c, Xts, Yts);
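For instance, to try several values of lambda in one go, you could wrap the two calls above in a loop (a minimal sketch using only the functions shown in this exercise):
for lambda = [0, 0.01, 0.1, 0.5, 1]        % a few values in [0, 1]
    c = linearLRTrain(Xtr, Ytr, lambda);
    figure
    confidenceLRLinear(c, Xts, Yts);
    title(['Lambda = ', num2str(lambda)]);
end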
When is it stable with respect to perturbations of the dataset? How is the computational time of linearLRTrain affected by the choice of lambda?
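To answer the timing question, you could measure each training run with tic/toc (a minimal sketch; timings fluctuate between runs, so consider averaging over several repetitions):
lambdas = [0, 0.01, 0.1, 0.5, 1];
times = zeros(size(lambdas));
for i = 1:numel(lambdas)
    tic;
    c = linearLRTrain(Xtr, Ytr, lambdas(i));
    times(i) = toc;                 % training time in seconds
end
figure
plot(lambdas, times)
xlabel('\lambda')
ylabel('Training time (s)')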
1.E Perform the same experiment by using flipped labels (Ytrn = flipLabels(Ytr, p)) with p equal to 0.05 and 0.1.
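For example (flipLabels is called exactly as given above; train on the noisy labels Ytrn but evaluate on the clean test set):
p = 0.05;                          % try also p = 0.1
Ytrn = flipLabels(Ytr, p);         % flip a fraction p of the training labels
c = linearLRTrain(Xtr, Ytrn, lambda);
confidenceLRLinear(c, Xts, Yts);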
2.A Generate a 2-class training set by using the following code
[Xtr, Ytr] = MixGauss([[-1;-1],[1;1]],[0.5,0.25],50);
[Xts, Yts] = MixGauss([[-1;-1],[1;1]],[0.5,0.25],100);
Ytr(Ytr==2) = -1;
Yts(Yts==2) = -1;
2.B Have a look at the code of functions kernLRTrain and kernLRTest.
2.C Check how the separating function changes with respect to lambda and sigma. Use the Gaussian kernel (kernel='gaussian') and try some regularization parameters, with sigma in [0.1, 1] and lambda in [0, 1]. To visualize the separating function (and thus get a more general view of which areas are associated with each class) you may use the routine separatingKernLR (type "help separatingKernLR" in the Matlab shell; if you still have doubts about how to use it, have a look at the code).
For example, you could use the following code:
c = kernLRTrain(Xtr, Ytr, kernel, sigma, lambda);
figure
separatingKernLR(c, Xtr, kernel, sigma, Xts, Yts);
title({'Kernel Logistic Regression' ; 'Separating function and test samples' ; ['Sigma = ' , num2str(sigma) , ' Lambda = ' , num2str(lambda)] } );
xlabel('X') % x-axis label
ylabel('Y') % y-axis label
Yprob = Yts;
Yprob(Yprob==-1) = 0;   % map labels {-1, 1} to {0, 1} for the color scale
hold on
scatter(Xts(:,1), Xts(:,2), 25, Yprob, 'filled');   % overlay the test samples
hold off
2.D Check how the confidence function changes with respect to sigma and lambda. Use the Gaussian kernel (kernel='gaussian') and try some regularization parameters, with sigma in [0.1, 1] and lambda in [0, 1]. In order to visualize the confidence function and the test samples, use the following code:
c = kernLRTrain(Xtr, Ytr, kernel, sigma, lambda);
figure
confidenceLRKern(c, Xtr, kernel, sigma, Xts, Yts);
title({'Kernel Logistic Regression' ; 'Confidence function and test samples' ; ['Sigma = ' , num2str(sigma) , ' Lambda = ' , num2str(lambda)] } );
xlabel('X') % x-axis label
ylabel('Y') % y-axis label
Yprob = Yts;
Yprob(Yprob==-1) = 0;   % map labels {-1, 1} to {0, 1} for the color scale
hold on
scatter(Xts(:,1), Xts(:,2), 25, Yprob,'filled');
hold off
When is it stable with respect to perturbations of the dataset? How is the computational time of kernLRTrain affected by the choice of lambda? (You can measure it with tic/toc, as in exercise 1.D.)
2.E Perform the same experiment by using flipped labels (Ytrn = flipLabels(Ytr, p)) with p equal to 0.05 and 0.1.
2.F Load the “Two moons” dataset by using the command
[Xtr, Ytrn, Xts, Ytsn] = two_moons(npoints, pflipped)
where npoints is the number of points in the dataset (between 1 and 100) and pflipped is the fraction of flipped labels. Then visualize the training and the test set with the following lines.
scatter(Xtr(:,1), Xtr(:,2), 25, Ytrn, 'filled')
figure;
scatter(Xts(:,1), Xts(:,2), 25, Ytsn, 'filled')
2.G Perform the exercises 2.C, 2.D on the “Two moons” dataset.
3.A By using the dataset in 2.F with 100 points and p = 0.05 flipped labels, select a suitable lambda using holdoutCVKernLR (see "help holdoutCVKernLR" for more information) with the following parameters:
intKerPar = 0.3;
intLambda = [1, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0001,0.00001,0.000001];
nrip = 7;
perc = 0.5;
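A plausible call, assuming holdoutCVKernLR returns the selected parameters together with the median validation and training errors Vm and Tm used below (the exact argument and output order may differ, so check "help holdoutCVKernLR"):
kernel = 'gaussian';
[lBest, sBest, Vm, Vs, Tm, Ts] = holdoutCVKernLR(Xtr, Ytrn, kernel, perc, nrip, intKerPar, intLambda);  % signature assumed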
Then plot the validation error and the training error with respect to the choice of lambda by using the following code (the x-axis has a logarithmic scale):
figure
semilogx(intLambda, Vm, 'b');
hold on
semilogx(intLambda, Tm, 'r');
xlabel('\lambda') % x-axis label
ylabel('Median error') % y-axis label
legend('Validation error','Training error');
3.B Perform the same experiment for different fractions of flipped labels (0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5). Check how the training and validation errors change with different p.
3.C Now select a suitable sigma using holdoutCVKernLR over the following range of parameters, as in exercise 3.A (with 100 points and 0.05 flipped labels). Then plot the validation and training error.
intKerPar = [10, 5, 2, 1, 0.7, 0.5, 0.3, 0.2, 0.1, 0.07, 0.05];
intLambda = 0.001;
nrip = 21;
perc = 0.5;
3.D Perform the same experiment for different fractions of flipped labels (0.0, 0.05, 0.1, 0.15, 0.2). Check how the training and validation errors change with different p.
3.E Now select the best lambda and sigma using holdoutCVKernLR over the following ranges, as in exercise 3.C (with 100 points and 0.05 flipped labels).
intLambda = [1, 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001];
intKerPar = [10, 5, 1.0, 0.5, 0.3, 0.1, 0.01];
nrip = 7;
perc = 0.5;
Then plot the separating and the confidence functions computed with the best lambda and sigma you have found (use separatingKernLR and confidenceLRKern).
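For instance, reusing the plotting calls from exercises 2.C and 2.D (here lBest and sBest denote the lambda and sigma returned by holdoutCVKernLR, under the same assumption on its outputs as in 3.A):
c = kernLRTrain(Xtr, Ytrn, kernel, sBest, lBest);   % train with the selected parameters
figure
separatingKernLR(c, Xtr, kernel, sBest, Xts, Ytsn);
figure
confidenceLRKern(c, Xtr, kernel, sBest, Xts, Ytsn);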
3.F Compute the best lambda and sigma, and plot the related separating functions with 0%, 5%, 10%, 20% of flipped labels. How do the parameters and the curves differ?
4.A Repeat the experiment in section 3 with fewer points (70, 50, 30, 20) and 5% of flipped labels. How do the parameters vary with respect to the number of points?
4.B Repeat the experiment in section 2 with the polynomial kernel (kernel = 'polynomial'), with lambda in the interval [0, 10] and the exponent q of the polynomial kernel in {1, 2, ..., 10}.
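A minimal sketch, assuming the exponent q is passed in the same argument slot that sigma occupies for the Gaussian kernel (have a look at the code of kernLRTrain to confirm how the kernel parameter is handled):
kernel = 'polynomial';
q = 3;                 % exponent of the polynomial kernel; try q in {1, ..., 10}
lambda = 0.01;         % try values in [0, 10]
c = kernLRTrain(Xtr, Ytr, kernel, q, lambda);
figure
separatingKernLR(c, Xtr, kernel, q, Xts, Yts);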
4.C Perform the Exercise 3.F with the polynomial kernel and the following range of parameters.
intKerPar = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
intLambda = [1, 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001];
What is the best exponent for the polynomial kernel on this problem? Why?
4.D Analyze the eigenvalues of the kernel matrix (use the KernelMatrix function for computing it) for a polynomial kernel with different values of q (plot them by using semilogy). What happens with different q? Why?
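A minimal sketch, assuming KernelMatrix takes the two sets of points, the kernel name, and the kernel parameter (check "help KernelMatrix" for the exact signature):
figure
for q = [1, 3, 5, 10]
    K = KernelMatrix(Xtr, Xtr, 'polynomial', q);       % kernel matrix on the training set; signature assumed
    e = sort(eig(K), 'descend');                       % eigenvalues, largest first
    e = max(e, eps);                                   % guard against tiny negative values from round-off
    semilogy(e, 'DisplayName', ['q = ', num2str(q)]);
    hold on
end
hold off
legend show
xlabel('Eigenvalue index')
ylabel('Eigenvalue (log scale)')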